Visible and Infrared Image Fusion Using Encoder-Decoder Network

Ataman, Ferhat Can, Akar, Gözde Bozdaği

arXiv.org Artificial Intelligence

The aim of multispectral image fusion is to combine object or scene features from images with different spectral characteristics to increase perceptual quality. In this paper, we present a novel learning-based solution to the image fusion problem, focusing on infrared and visible spectrum images. The proposed solution uses only convolution and pooling layers, together with a loss function based on no-reference quality metrics. The analysis is performed qualitatively and quantitatively on various datasets. The results show better performance than state-of-the-art methods. Moreover, the small size of our network enables real-time performance on embedded devices. Project code can be found at \url{https://github.com/ferhatcan/pyFusionSR}.
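As an illustrative aside (not code from the paper), a minimal NumPy sketch of a gradient-energy sharpness measure — one simple kind of no-reference quality metric that such a fusion loss could build on. The function names and the specific objective here are hypothetical:

```python
import numpy as np

def gradient_energy(img: np.ndarray) -> float:
    """Mean squared gradient magnitude: a simple no-reference
    sharpness proxy of the kind a fusion loss can maximize."""
    gy, gx = np.gradient(img.astype(np.float64))
    return float(np.mean(gx**2 + gy**2))

def fusion_quality(fused, visible, infrared):
    """Toy no-reference objective: reward a fused image that retains
    at least the gradient energy of the sharper source image."""
    return gradient_energy(fused) - max(gradient_energy(visible),
                                        gradient_energy(infrared))
```

A flat image scores zero gradient energy, while an image with an edge scores positive, so maximizing such a term pushes the fusion network toward detail preservation.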


DAF-Net: A Dual-Branch Feature Decomposition Fusion Network with Domain Adaptive for Infrared and Visible Image Fusion

Xu, Jian, He, Xin

arXiv.org Artificial Intelligence

Infrared and visible image fusion aims to combine complementary information from both modalities to provide a more comprehensive scene understanding. However, due to the significant differences between the two modalities, preserving key features during the fusion process remains a challenge. To address this issue, we propose a dual-branch feature decomposition fusion network with domain adaptation (DAF-Net), which introduces Multi-Kernel Maximum Mean Discrepancy (MK-MMD) into the base encoder and designs a hybrid kernel function suitable for infrared and visible image fusion. The base encoder, built on the Restormer network, captures global structural information, while the detail encoder, based on Invertible Neural Networks (INN), focuses on extracting detailed texture information. By incorporating MK-MMD, DAF-Net effectively aligns the latent feature spaces of visible and infrared images, thereby improving the quality of the fused images. Experimental results demonstrate that the proposed method outperforms existing techniques across multiple datasets, significantly enhancing both visual quality and fusion performance. The related Python code is available at https://github.com/xujian000/DAF-Net.
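For readers unfamiliar with MK-MMD, a self-contained NumPy sketch of the standard biased multi-kernel MMD² estimate between two sample sets is shown below. This is the generic estimator, not the paper's hybrid kernel; the bandwidth choices are illustrative:

```python
import numpy as np

def mk_mmd(x, y, bandwidths=(0.5, 1.0, 2.0)):
    """Biased multi-kernel MMD^2 between sample matrices x and y
    (rows are feature vectors), using a uniform mixture of RBF kernels."""
    def kernel(a, b):
        # Pairwise squared Euclidean distances between rows of a and b.
        d2 = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
        return sum(np.exp(-d2 / (2 * s**2)) for s in bandwidths) / len(bandwidths)
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()
```

Minimizing such a term between the latent features of the two encoders is what pulls the visible and infrared feature distributions together: identical distributions give a value near zero, and a shifted distribution gives a larger value.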


Infrared Image Super-Resolution: Systematic Review, and Future Trends

Huang, Yongsong, Miyazaki, Tomo, Liu, Xiaofeng, Omachi, Shinichiro

arXiv.org Artificial Intelligence

Image Super-Resolution (SR) is essential for a wide range of computer vision and image processing tasks. Investigating infrared (IR) image (or thermal image) super-resolution is a continuing concern within the development of deep learning. This survey aims to provide a comprehensive perspective on IR image super-resolution, including its applications, hardware imaging system dilemmas, and a taxonomy of image processing methodologies. In addition, the datasets and evaluation metrics used in IR image super-resolution tasks are discussed. Furthermore, the deficiencies of current technologies and promising directions for the community to explore are highlighted. To cope with the rapid development of this field, we intend to regularly update the relevant excellent work at \url{https://github.com/yongsongH/Infrared_Image_SR_Survey}.


Visible and NIR Image Fusion Algorithm Based on Information Complementarity

Li, Zhuo, Li, Bo

arXiv.org Artificial Intelligence

Visible and near-infrared (NIR) band sensors provide images that capture complementary spectral radiation from a scene, and the fusion of visible and NIR images aims to exploit their spectral properties to enhance image quality. However, current visible-NIR fusion algorithms do not take full advantage of these spectral properties and lack information complementarity, which results in color distortion and artifacts. Therefore, this paper designs a complementary fusion model at the level of physical signals. First, in order to distinguish noise from useful information, we use two layers of the weight-guided filter and the guided filter to obtain texture and edge layers, respectively. Second, to generate the initial visible-NIR complementarity weight map, the difference maps of the visible and NIR images are filtered by the extended-DoG filter. After that, the salient region of NIR night-time compensation guides the initial complementarity weight map through the arctan function. Finally, the fused images are generated from the complementarity weight maps of the visible and NIR images, respectively. The experimental results demonstrate that the proposed algorithm not only takes good advantage of the spectral properties and information complementarity, but also avoids unnatural color while maintaining naturalness, outperforming the state-of-the-art.
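To make the weight-map idea concrete, here is a heavily simplified NumPy sketch of DoG-guided convex fusion: a difference-of-Gaussians response on the VIS-NIR difference map is squashed through arctan into a per-pixel weight, then used for a pixel-wise blend. This omits the paper's guided-filter decomposition and night-time compensation; the sigmas and the exact squashing are assumptions:

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur built from np.convolve."""
    r = max(1, int(3 * sigma))
    t = np.arange(-r, r + 1)
    k = np.exp(-t**2 / (2 * sigma**2))
    k /= k.sum()
    out = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 0, img)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 1, out)

def fuse(vis, nir, s1=1.0, s2=2.0):
    """Weight map from a DoG response of the NIR-VIS difference map,
    squashed into (0, 1) by arctan, then a pixel-wise convex blend."""
    diff = nir - vis
    dog = gaussian_blur(diff, s1) - gaussian_blur(diff, s2)
    w = 0.5 + np.arctan(dog) / np.pi   # (0, 1) mixing weight per pixel
    return w * nir + (1.0 - w) * vis
```

Because the weight stays in (0, 1), each fused pixel is a convex combination of the two inputs, which is one simple way to avoid out-of-range artifacts.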


Pedestrian detection for low-light vision proposal

Chang, Zhipeng, Ma, Ruiling, Jia, Wenliang

arXiv.org Artificial Intelligence

The demand for pedestrian detection has created a challenging problem for various visual tasks such as image fusion. As infrared images can capture thermal radiation information, fusing infrared and visible images can significantly improve target detection under environmental limitations. In our project, we approach this by pre-processing our dataset with an image fusion technique and then using a Vision Transformer (ViT) [5] model to detect pedestrians in the fused images. During evaluation, we compare the performance of YOLOv5 and the revised ViT model on our fused images.


Application of image-to-image translation in improving pedestrian detection

Patel, Devarsh, Patel, Sarthak, Patel, Megh

arXiv.org Artificial Intelligence

The lack of effective target regions makes it difficult to perform several visual tasks in low-intensity light, including pedestrian recognition and image-to-image translation. In this situation, by accumulating high-quality information through the combined use of infrared and visible images, it is possible to detect pedestrians even in low light. In this study we use advanced deep learning models, pix2pixGAN and YOLOv7, on the LLVIP dataset, which contains visible-infrared image pairs for low-light vision. This dataset contains 33,672 images, most of them captured in dark scenes and tightly synchronized in time and location.


Visible-Infrared Person Re-Identification Using Privileged Intermediate Information

Alehdaghi, Mahdi, Josi, Arthur, Cruz, Rafael M. O., Granger, Eric

arXiv.org Artificial Intelligence

Visible-infrared person re-identification (ReID) aims to recognize the same person of interest across a network of RGB and IR cameras. Some deep learning (DL) models have directly incorporated both modalities to discriminate persons in a joint representation space. However, this cross-modal ReID problem remains challenging due to the large domain shift in data distributions between the RGB and IR modalities. This paper introduces a novel approach for creating an intermediate virtual domain that acts as a bridge between the two main domains (i.e., the RGB and IR modalities) during training. This intermediate domain is treated as privileged information (PI) that is unavailable at test time, which allows formulating this cross-modal matching task as a problem of learning under privileged information (LUPI). We devised a new method to generate images between the visible and infrared domains that provide additional information for training a deep ReID model through intermediate domain adaptation. In particular, by employing color-free and multi-step triplet loss objectives during training, our method provides common feature representation spaces that are robust to large visible-infrared domain shifts. Experimental results on challenging visible-infrared ReID datasets indicate that our proposed approach consistently improves matching accuracy, without any computational overhead at test time.
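As background for the loss objectives mentioned above, a NumPy sketch of the standard triplet loss on embedding batches. The paper's multi-step variant chains terms of this form across the RGB, intermediate, and IR domains; this is only the basic building block:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard triplet loss: push the anchor-positive L2 distance
    below the anchor-negative distance by at least `margin`.
    Inputs are (batch, dim) embedding arrays."""
    d_pos = np.linalg.norm(anchor - positive, axis=-1)
    d_neg = np.linalg.norm(anchor - negative, axis=-1)
    return float(np.maximum(d_pos - d_neg + margin, 0.0).mean())
```

When the negative is already far enough away the loss is zero, so only "hard" triplets contribute gradient during training.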


Machine learning creates full-colour images from infrared cameras – Physics World

#artificialintelligence

Infrared night-vision systems that see in colour could be a reality thanks to researchers in the US, who have used machine learning to create colour images of photographs that are illuminated with just infrared light. The team hope their technique could be further developed to create imaging systems that operate where the use of visible light is impossible, such as retinal surgery. Traditional night vision systems work by illuminating an area with near infrared radiation and detecting the reflections or by using ultrasensitive cameras to detect the small amount of light present even at night. Both, however, usually produce monochromatic images, so researchers are seeking ways to produce multi-colour images of objects without having to bathe them in visible light. Computer scientist Pierre Baldi of University of California, Irvine (UCI), explains that this would be very useful in medical applications where use of visible light is problematic.


Res2NetFuse: A Fusion Method for Infrared and Visible Images

Song, Xu, Wu, Xiao-Jun, Li, Hui, Sun, Jun, Palade, Vasile

arXiv.org Artificial Intelligence

This paper presents a novel Res2Net-based fusion framework for infrared and visible images. The proposed fusion model has three parts: an encoder, a fusion layer, and a decoder. The Res2Net-based encoder extracts multi-scale features from the source images, and the paper introduces a new training strategy that trains the Res2Net-based encoder using only a single image. Then, a new fusion strategy is developed based on an attention model. Finally, the fused image is reconstructed by the decoder. The proposed approach is also analyzed in detail. Experiments show that our method achieves state-of-the-art fusion performance in both objective and subjective assessment compared with existing methods.
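To illustrate what an attention-based fusion layer can look like in its simplest form (this is a generic sketch, not the paper's strategy): per-pixel weights are derived from a softmax over the channel-wise L1 activity of each source's feature map, then used to mix the features:

```python
import numpy as np

def attention_fuse(feat_ir, feat_vis):
    """Attention-style fusion of two (C, H, W) feature maps:
    a per-pixel softmax over each source's channel-wise L1 activity
    gives the mixing weights for a weighted sum."""
    a_ir = np.abs(feat_ir).sum(axis=0)    # (H, W) activity map
    a_vis = np.abs(feat_vis).sum(axis=0)
    # Numerically stable 2-way softmax over the stacked activity maps.
    e = np.exp(np.stack([a_ir, a_vis]) - np.maximum(a_ir, a_vis))
    w_ir, w_vis = e / e.sum(axis=0)
    return w_ir * feat_ir + w_vis * feat_vis
```

Pixels where the infrared features are much more active are dominated by the infrared branch, and identical inputs pass through unchanged, since the two weights always sum to one.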


Heterogeneous Visible-Thermal and Visible-Infrared Face Recognition using Unit-Class Loss and Cross-Modality Discriminator

Cheema, Usman, Ahmad, Mobeen, Han, Dongil, Moon, Seungbin

arXiv.org Artificial Intelligence

Visible-to-thermal face image matching is a challenging variant of cross-modality recognition. The challenge lies in the large modality gap and low correlation between the visible and thermal modalities. Existing approaches employ image preprocessing, feature extraction, or common subspace projection, each of which is an independent problem in itself. In this paper, we propose an end-to-end framework for cross-modal face recognition. The proposed algorithm aims to learn identity-discriminative features from unprocessed facial images and to identify cross-modal image pairs. A novel Unit-Class Loss is proposed for preserving identity information while discarding modality information. In addition, a Cross-Modality Discriminator block is proposed to integrate image-pair classification capability into the network. The proposed network can be used to extract modality-independent vector representations or to perform matching-pair classification for test images. Our cross-modality face recognition experiments on five independent databases demonstrate that the proposed method achieves marked improvement over existing state-of-the-art methods. The applications of facial recognition (FR) systems have increased exponentially with the advent of deep convolutional neural networks. Automated FR is being used in personal devices, public surveillance, access control, security, marketing, and other applications. FR rates on visible images have improved considerably in the past few years. However, there are limitations to using FR in scenarios involving extreme variations in illumination, expression, pose, presentation attacks, and disguises [1-3]. Imaging technologies beyond the visible spectrum are being adopted to overcome the limitations of visible imaging.
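For context on the matching-pair setting, here is an intentionally simple NumPy baseline: a fixed cosine-similarity threshold on two modality-independent embeddings. The paper's contribution is precisely to replace such a hand-set rule with a learned Cross-Modality Discriminator; the function name and threshold below are hypothetical:

```python
import numpy as np

def is_same_identity(emb_vis, emb_thermal, threshold=0.5):
    """Baseline pair decision: cosine similarity between a visible
    embedding and a thermal embedding, thresholded to same/different."""
    cos = emb_vis @ emb_thermal / (
        np.linalg.norm(emb_vis) * np.linalg.norm(emb_thermal))
    return bool(cos >= threshold)
```

Nearly parallel embeddings are accepted as the same identity, orthogonal ones rejected; a trained discriminator block instead learns this decision boundary jointly with the features.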